Dataset statistics
| Number of variables | 9 |
|---|---|
| Number of observations | 79215 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 5.4 MiB |
| Average record size in memory | 72.0 B |
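The memory figures in this table can be cross-checked with a little arithmetic. The sketch below assumes that each of the 9 numeric columns stores 8 bytes per value (for example `float64` or `int64`); the report does not state the dtypes, so that width is an assumption, and the variable names are purely illustrative.

```python
# Values taken from the dataset statistics table.
n_rows = 79215
n_columns = 9

# Assumption: every value occupies 8 bytes (e.g. float64 / int64).
BYTES_PER_VALUE = 8

record_bytes = n_columns * BYTES_PER_VALUE      # bytes needed for one row
total_mib = n_rows * record_bytes / 2**20       # whole table, in MiB

print(f"record size: {record_bytes} B")
print(f"total size:  {total_mib:.1f} MiB")
```

Under that assumption the result agrees with the reported 72.0 B per record and 5.4 MiB in total.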
Variable types
| Numeric | 9 |
|---|---|
X_38 is highly correlated with X_39 and 1 other fields | High correlation |
X_39 is highly correlated with X_38 and 1 other fields | High correlation |
X_40 is highly correlated with X_38 and 1 other fields | High correlation |
X_41 is highly correlated with X_43 | High correlation |
X_42 is highly correlated with X_44 and 1 other fields | High correlation |
X_43 is highly correlated with X_41 and 1 other fields | High correlation |
X_44 is highly correlated with X_42 | High correlation |
X_45 is highly correlated with X_42 and 1 other fields | High correlation |
X_38 is highly correlated with X_39 and 1 other fields | High correlation |
X_39 is highly correlated with X_38 | High correlation |
X_40 is highly correlated with X_38 | High correlation |
X_41 is highly correlated with X_43 | High correlation |
X_42 is highly correlated with X_44 and 1 other fields | High correlation |
X_43 is highly correlated with X_41 and 1 other fields | High correlation |
X_44 is highly correlated with X_42 | High correlation |
X_45 is highly correlated with X_42 and 1 other fields | High correlation |
X_38 is highly correlated with X_39 and 1 other fields | High correlation |
X_39 is highly correlated with X_38 and 1 other fields | High correlation |
X_40 is highly correlated with X_38 and 1 other fields | High correlation |
X_41 is highly correlated with X_43 | High correlation |
X_42 is highly correlated with X_44 | High correlation |
X_43 is highly correlated with X_41 and 1 other fields | High correlation |
X_44 is highly correlated with X_42 | High correlation |
X_45 is highly correlated with X_43 | High correlation |
X_38 is highly correlated with X_41 and 1 other fields | High correlation |
X_39 is highly correlated with X_40 | High correlation |
X_40 is highly correlated with X_39 | High correlation |
X_41 is highly correlated with X_38 and 4 other fields | High correlation |
X_42 is highly correlated with X_38 and 3 other fields | High correlation |
X_43 is highly correlated with X_41 and 2 other fields | High correlation |
X_44 is highly correlated with X_41 and 2 other fields | High correlation |
X_45 is highly correlated with X_41 and 2 other fields | High correlation |
X_38 is highly skewed (γ1 = 21.53589587) | Skewed |
df_index is uniformly distributed | Uniform |
Reproduction
| Analysis started | 2022-08-07 05:51:47.453350 |
|---|---|
| Analysis finished | 2022-08-07 05:52:00.996080 |
| Duration | 13.54 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
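The reported duration follows directly from the two timestamps. A minimal sketch using only the standard library, with the timestamp format string inferred from how the values are printed in the table:

```python
from datetime import datetime

# Format inferred from the timestamps shown in the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-08-07 05:51:47.453350", FMT)
finished = datetime.strptime("2022-08-07 05:52:00.996080", FMT)

# Subtracting two datetimes yields a timedelta.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```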
df_index
| Distinct | 39608 |
|---|---|
| Distinct (%) | 50.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 19803.25 |
| Minimum | 0 |
| Maximum | 39607 |
| Zeros | 2 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 619.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1980 |
| Q1 | 9901.5 |
| median | 19803 |
| Q3 | 29705 |
| 95-th percentile | 37626.3 |
| Maximum | 39607 |
| Range | 39607 |
| Interquartile range (IQR) | 19803.5 |
Descriptive statistics
| Standard deviation | 11433.77256 |
|---|---|
| Coefficient of variation (CV) | 0.5773684907 |
| Kurtosis | -1.199999999 |
| Mean | 19803.25 |
| Median Absolute Deviation (MAD) | 9902 |
| Skewness | 1.6561712 × 10⁻⁹ |
| Sum | 1568714449 |
| Variance | 130731155.1 |
| Monotonicity | Not monotonic |
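Several of the statistics reported for this variable are simple functions of one another, so they can be sanity-checked against each other. The sketch below recomputes them from the reported values using the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, range = maximum - minimum, IQR = Q3 - Q1). The dictionary and variable names are illustrative only.

```python
# Reported values, copied from the tables above.
n = 79215                  # number of observations
total = 1568714449         # Sum
mean = 19803.25            # Mean
std = 11433.77256          # Standard deviation
minimum, q1, q3, maximum = 0, 9901.5, 29705, 39607

# Recompute the dependent statistics from the independent ones.
derived = {
    "mean": total / n,
    "cv": std / mean,
    "variance": std ** 2,
    "range": maximum - minimum,
    "iqr": q3 - q1,
}

for name, value in derived.items():
    print(f"{name:>8}: {value:.10g}")
```

Small differences in the last digits are expected because the inputs are already rounded. The reported kurtosis of about -1.2 is the excess kurtosis of a continuous uniform distribution, which fits the "uniformly distributed" alert for df_index.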
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 2 | < 0.1% |
| 26407 | 2 | < 0.1% |
| 26400 | 2 | < 0.1% |
| 26401 | 2 | < 0.1% |
| 26402 | 2 | < 0.1% |
| 26403 | 2 | < 0.1% |
| 26404 | 2 | < 0.1% |
| 26405 | 2 | < 0.1% |
| 26406 | 2 | < 0.1% |
| 26408 | 2 | < 0.1% |
| Other values (39598) | 79195 |
| Value | Count | Frequency (%) |
| 0 | 2 | |
| 1 | 2 | |
| 2 | 2 | |
| 3 | 2 | |
| 4 | 2 | |
| 5 | 2 | |
| 6 | 2 | |
| 7 | 2 | |
| 8 | 2 | |
| 9 | 2 |
| Value | Count | Frequency (%) |
| 39607 | 1 | |
| 39606 | 2 | |
| 39605 | 2 | |
| 39604 | 2 | |
| 39603 | 2 | |
| 39602 | 2 | |
| 39601 | 2 | |
| 39600 | 2 | |
| 39599 | 2 | |
| 39598 | 2 |
X_38
| Distinct | 267 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -15.90735101 |
| Minimum | -17.09 |
| Maximum | 32.23 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 79214 |
| Negative (%) | > 99.9% |
| Memory size | 619.0 KiB |
Quantile statistics
| Minimum | -17.09 |
|---|---|
| 5-th percentile | -16.33 |
| Q1 | -16.16 |
| median | -15.99 |
| Q3 | -15.75 |
| 95-th percentile | -15.23 |
| Maximum | 32.23 |
| Range | 49.32 |
| Interquartile range (IQR) | 0.41 |
Descriptive statistics
| Standard deviation | 0.5320721989 |
|---|---|
| Coefficient of variation (CV) | -0.03344819629 |
| Kurtosis | 1140.399844 |
| Mean | -15.90735101 |
| Median Absolute Deviation (MAD) | 0.2 |
| Skewness | 21.53589587 |
| Sum | -1260100.81 |
| Variance | 0.2831008248 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -16.11 | 1499 | 1.9% |
| -16.08 | 1453 | 1.8% |
| -16.04 | 1401 | 1.8% |
| -16.02 | 1396 | 1.8% |
| -15.99 | 1392 | 1.8% |
| -16.13 | 1376 | 1.7% |
| -16.17 | 1329 | 1.7% |
| -16.19 | 1291 | 1.6% |
| -16.23 | 1204 | 1.5% |
| -16.1 | 1135 | 1.4% |
| Other values (257) | 65739 |
| Value | Count | Frequency (%) |
| -17.09 | 1 | |
| -17.05 | 1 | |
| -17.04 | 1 | |
| -17.02 | 1 | |
| -17.01 | 1 | |
| -16.95 | 1 | |
| -16.94 | 1 | |
| -16.93 | 2 | |
| -16.91 | 1 | |
| -16.88 | 1 |
| Value | Count | Frequency (%) |
| 32.23 | 1 | < 0.1% |
| -2.65 | 61 | |
| -14.1 | 1 | < 0.1% |
| -14.12 | 1 | < 0.1% |
| -14.2 | 1 | < 0.1% |
| -14.25 | 1 | < 0.1% |
| -14.26 | 1 | < 0.1% |
| -14.35 | 1 | < 0.1% |
| -14.38 | 1 | < 0.1% |
| -14.39 | 1 | < 0.1% |
X_39
| Distinct | 264 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -15.89356599 |
| Minimum | -17.09 |
| Maximum | -2.65 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 79215 |
| Negative (%) | 100.0% |
| Memory size | 619.0 KiB |
Quantile statistics
| Minimum | -17.09 |
|---|---|
| 5-th percentile | -16.34 |
| Q1 | -16.16 |
| median | -15.99 |
| Q3 | -15.75 |
| 95-th percentile | -15.24 |
| Maximum | -2.65 |
| Range | 14.44 |
| Interquartile range (IQR) | 0.41 |
Descriptive statistics
| Standard deviation | 0.7055989191 |
|---|---|
| Coefficient of variation (CV) | -0.04439525525 |
| Kurtosis | 268.278863 |
| Mean | -15.89356599 |
| Median Absolute Deviation (MAD) | 0.19 |
| Skewness | 14.534386 |
| Sum | -1259008.83 |
| Variance | 0.4978698346 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -16.11 | 1511 | 1.9% |
| -16.08 | 1488 | 1.9% |
| -16.13 | 1449 | 1.8% |
| -16.02 | 1418 | 1.8% |
| -16.04 | 1385 | 1.7% |
| -16.17 | 1329 | 1.7% |
| -15.99 | 1310 | 1.7% |
| -16.19 | 1302 | 1.6% |
| -15.95 | 1242 | 1.6% |
| -16.23 | 1221 | 1.5% |
| Other values (254) | 65560 |
| Value | Count | Frequency (%) |
| -17.09 | 1 | < 0.1% |
| -17.07 | 1 | < 0.1% |
| -16.99 | 2 | < 0.1% |
| -16.97 | 2 | < 0.1% |
| -16.93 | 1 | < 0.1% |
| -16.91 | 2 | < 0.1% |
| -16.89 | 2 | < 0.1% |
| -16.88 | 3 | |
| -16.86 | 1 | < 0.1% |
| -16.84 | 5 |
| Value | Count | Frequency (%) |
| -2.65 | 173 | |
| -14.11 | 1 | < 0.1% |
| -14.15 | 1 | < 0.1% |
| -14.16 | 1 | < 0.1% |
| -14.21 | 1 | < 0.1% |
| -14.24 | 1 | < 0.1% |
| -14.26 | 1 | < 0.1% |
| -14.31 | 1 | < 0.1% |
| -14.33 | 1 | < 0.1% |
| -14.36 | 1 | < 0.1% |
X_40
| Distinct | 263 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -16.57232961 |
| Minimum | -17.75 |
| Maximum | -14.78 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 79215 |
| Negative (%) | 100.0% |
| Memory size | 619.0 KiB |
Quantile statistics
| Minimum | -17.75 |
|---|---|
| 5-th percentile | -16.99 |
| Q1 | -16.81 |
| median | -16.64 |
| Q3 | -16.4 |
| 95-th percentile | -15.88 |
| Maximum | -14.78 |
| Range | 2.97 |
| Interquartile range (IQR) | 0.41 |
Descriptive statistics
| Standard deviation | 0.3444261602 |
|---|---|
| Coefficient of variation (CV) | -0.02078320721 |
| Kurtosis | 1.383041925 |
| Mean | -16.57232961 |
| Median Absolute Deviation (MAD) | 0.2 |
| Skewness | 1.086859227 |
| Sum | -1312777.09 |
| Variance | 0.1186293798 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -16.72 | 1432 | 1.8% |
| -16.76 | 1423 | 1.8% |
| -16.7 | 1408 | 1.8% |
| -16.79 | 1380 | 1.7% |
| -16.67 | 1377 | 1.7% |
| -16.81 | 1355 | 1.7% |
| -16.63 | 1316 | 1.7% |
| -16.85 | 1306 | 1.6% |
| -16.61 | 1182 | 1.5% |
| -16.88 | 1137 | 1.4% |
| Other values (253) | 65899 |
| Value | Count | Frequency (%) |
| -17.75 | 1 | |
| -17.72 | 2 | |
| -17.69 | 1 | |
| -17.62 | 1 | |
| -17.59 | 1 | |
| -17.58 | 1 | |
| -17.56 | 2 | |
| -17.55 | 2 | |
| -17.53 | 2 | |
| -17.52 | 1 |
| Value | Count | Frequency (%) |
| -14.78 | 1 | < 0.1% |
| -14.8 | 1 | < 0.1% |
| -14.83 | 1 | < 0.1% |
| -14.88 | 1 | < 0.1% |
| -14.97 | 2 | |
| -15 | 1 | < 0.1% |
| -15.01 | 1 | < 0.1% |
| -15.05 | 1 | < 0.1% |
| -15.06 | 1 | < 0.1% |
| -15.07 | 4 |
X_41
| Distinct | 44 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 21.1871364 |
| Minimum | 20.73 |
| Maximum | 21.62 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 619.0 KiB |
Quantile statistics
| Minimum | 20.73 |
|---|---|
| 5-th percentile | 21.14 |
| Q1 | 21.17 |
| median | 21.19 |
| Q3 | 21.21 |
| 95-th percentile | 21.24 |
| Maximum | 21.62 |
| Range | 0.89 |
| Interquartile range (IQR) | 0.04 |
Descriptive statistics
| Standard deviation | 0.03086650198 |
|---|---|
| Coefficient of variation (CV) | 0.001456851053 |
| Kurtosis | 3.214408381 |
| Mean | 21.1871364 |
| Median Absolute Deviation (MAD) | 0.02 |
| Skewness | -0.08115085437 |
| Sum | 1678339.01 |
| Variance | 0.0009527409446 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=44)
| Value | Count | Frequency (%) |
| 21.19 | 11427 | |
| 21.17 | 10227 | |
| 21.18 | 9520 | |
| 21.2 | 8385 | |
| 21.21 | 8289 | |
| 21.16 | 6639 | |
| 21.15 | 5454 | |
| 21.22 | 5045 | |
| 21.23 | 3829 | 4.8% |
| 21.14 | 2804 | 3.5% |
| Other values (34) | 7596 |
| Value | Count | Frequency (%) |
| 20.73 | 1 | |
| 20.78 | 1 | |
| 20.81 | 2 | |
| 20.85 | 2 | |
| 20.86 | 1 | |
| 20.89 | 1 | |
| 20.97 | 1 | |
| 20.98 | 1 | |
| 20.99 | 1 | |
| 21.01 | 2 |
| Value | Count | Frequency (%) |
| 21.62 | 1 | < 0.1% |
| 21.51 | 1 | < 0.1% |
| 21.33 | 2 | < 0.1% |
| 21.32 | 6 | < 0.1% |
| 21.31 | 23 | < 0.1% |
| 21.3 | 43 | 0.1% |
| 21.29 | 65 | 0.1% |
| 21.28 | 158 | 0.2% |
| 21.27 | 288 | |
| 21.26 | 652 |
X_42
| Distinct | 45 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 21.05950237 |
| Minimum | 20.79 |
| Maximum | 21.44 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 619.0 KiB |
Quantile statistics
| Minimum | 20.79 |
|---|---|
| 5-th percentile | 21 |
| Q1 | 21.03 |
| median | 21.06 |
| Q3 | 21.09 |
| 95-th percentile | 21.13 |
| Maximum | 21.44 |
| Range | 0.65 |
| Interquartile range (IQR) | 0.06 |
Descriptive statistics
| Standard deviation | 0.04023339753 |
|---|---|
| Coefficient of variation (CV) | 0.001910462879 |
| Kurtosis | 0.4649611753 |
| Mean | 21.05950237 |
| Median Absolute Deviation (MAD) | 0.03 |
| Skewness | 0.05199705833 |
| Sum | 1668228.48 |
| Variance | 0.001618726277 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=45)
| Value | Count | Frequency (%) |
| 21.06 | 8419 | |
| 21.05 | 7717 | |
| 21.08 | 7441 | |
| 21.04 | 7207 | |
| 21.07 | 6645 | |
| 21.03 | 6368 | 8.0% |
| 21.09 | 5170 | 6.5% |
| 21.1 | 5126 | 6.5% |
| 21.02 | 4755 | 6.0% |
| 21.01 | 4185 | 5.3% |
| Other values (35) | 16182 |
| Value | Count | Frequency (%) |
| 20.79 | 1 | < 0.1% |
| 20.81 | 1 | < 0.1% |
| 20.88 | 1 | < 0.1% |
| 20.89 | 10 | < 0.1% |
| 20.9 | 22 | < 0.1% |
| 20.91 | 24 | < 0.1% |
| 20.92 | 49 | 0.1% |
| 20.93 | 63 | 0.1% |
| 20.94 | 115 | |
| 20.95 | 176 |
| Value | Count | Frequency (%) |
| 21.44 | 1 | < 0.1% |
| 21.31 | 3 | < 0.1% |
| 21.29 | 2 | < 0.1% |
| 21.28 | 4 | |
| 21.27 | 1 | < 0.1% |
| 21.25 | 2 | < 0.1% |
| 21.24 | 5 | |
| 21.23 | 8 | |
| 21.22 | 6 | |
| 21.21 | 6 |
X_43
| Distinct | 54 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 21.20391706 |
| Minimum | 20.8 |
| Maximum | 21.41 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 619.0 KiB |
Quantile statistics
| Minimum | 20.8 |
|---|---|
| 5-th percentile | 21.13 |
| Q1 | 21.17 |
| median | 21.2 |
| Q3 | 21.24 |
| 95-th percentile | 21.28 |
| Maximum | 21.41 |
| Range | 0.61 |
| Interquartile range (IQR) | 0.07 |
Descriptive statistics
| Standard deviation | 0.04727842814 |
|---|---|
| Coefficient of variation (CV) | 0.002229702559 |
| Kurtosis | 0.6533989495 |
| Mean | 21.20391706 |
| Median Absolute Deviation (MAD) | 0.03 |
| Skewness | -0.1374009512 |
| Sum | 1679668.29 |
| Variance | 0.002235249768 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 21.19 | 7530 | 9.5% |
| 21.21 | 7475 | 9.4% |
| 21.2 | 6345 | 8.0% |
| 21.17 | 5883 | 7.4% |
| 21.23 | 5713 | 7.2% |
| 21.18 | 5709 | 7.2% |
| 21.22 | 5583 | 7.0% |
| 21.24 | 5308 | 6.7% |
| 21.16 | 3910 | 4.9% |
| 21.25 | 3750 | 4.7% |
| Other values (44) | 22009 |
| Value | Count | Frequency (%) |
| 20.8 | 1 | < 0.1% |
| 20.84 | 1 | < 0.1% |
| 20.87 | 1 | < 0.1% |
| 20.89 | 1 | < 0.1% |
| 20.9 | 1 | < 0.1% |
| 20.92 | 2 | |
| 20.93 | 1 | < 0.1% |
| 20.95 | 1 | < 0.1% |
| 20.96 | 1 | < 0.1% |
| 20.97 | 3 |
| Value | Count | Frequency (%) |
| 21.41 | 1 | < 0.1% |
| 21.4 | 4 | < 0.1% |
| 21.39 | 7 | < 0.1% |
| 21.38 | 7 | < 0.1% |
| 21.37 | 19 | < 0.1% |
| 21.36 | 25 | < 0.1% |
| 21.35 | 65 | 0.1% |
| 21.34 | 90 | 0.1% |
| 21.33 | 170 | |
| 21.32 | 304 |
X_44
| Distinct | 37 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 21.16023266 |
| Minimum | 20.93 |
| Maximum | 21.32 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 619.0 KiB |
Quantile statistics
| Minimum | 20.93 |
|---|---|
| 5-th percentile | 21.1 |
| Q1 | 21.13 |
| median | 21.16 |
| Q3 | 21.19 |
| 95-th percentile | 21.22 |
| Maximum | 21.32 |
| Range | 0.39 |
| Interquartile range (IQR) | 0.06 |
Descriptive statistics
| Standard deviation | 0.04214197817 |
|---|---|
| Coefficient of variation (CV) | 0.001991564972 |
| Kurtosis | -0.1111087146 |
| Mean | 21.16023266 |
| Median Absolute Deviation (MAD) | 0.03 |
| Skewness | -0.3667056325 |
| Sum | 1676207.83 |
| Variance | 0.001775946324 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=37)
| Value | Count | Frequency (%) |
| 21.19 | 8156 | |
| 21.12 | 7209 | |
| 21.14 | 7202 | |
| 21.13 | 6920 | |
| 21.2 | 6738 | |
| 21.21 | 6626 | |
| 21.15 | 5974 | 7.5% |
| 21.18 | 5380 | 6.8% |
| 21.17 | 4686 | 5.9% |
| 21.11 | 3983 | 5.0% |
| Other values (27) | 16341 |
| Value | Count | Frequency (%) |
| 20.93 | 1 | < 0.1% |
| 20.94 | 1 | < 0.1% |
| 20.95 | 4 | < 0.1% |
| 20.96 | 8 | < 0.1% |
| 20.97 | 3 | < 0.1% |
| 20.98 | 8 | < 0.1% |
| 20.99 | 19 | < 0.1% |
| 21 | 30 | < 0.1% |
| 21.01 | 74 | |
| 21.02 | 119 |
| Value | Count | Frequency (%) |
| 21.32 | 1 | < 0.1% |
| 21.28 | 2 | < 0.1% |
| 21.27 | 2 | < 0.1% |
| 21.26 | 19 | < 0.1% |
| 21.25 | 137 | 0.2% |
| 21.24 | 654 | 0.8% |
| 21.23 | 1951 | 2.5% |
| 21.22 | 3610 | |
| 21.21 | 6626 | |
| 21.2 | 6738 |
X_45
| Distinct | 40 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.154575901 |
| Minimum | 0 |
| Maximum | 0.42 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 619.0 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.08 |
| Q1 | 0.12 |
| median | 0.15 |
| Q3 | 0.19 |
| 95-th percentile | 0.23 |
| Maximum | 0.42 |
| Range | 0.42 |
| Interquartile range (IQR) | 0.07 |
Descriptive statistics
| Standard deviation | 0.04718884696 |
|---|---|
| Coefficient of variation (CV) | 0.3052794559 |
| Kurtosis | -0.353650489 |
| Mean | 0.154575901 |
| Median Absolute Deviation (MAD) | 0.03 |
| Skewness | 0.2462375801 |
| Sum | 12244.73 |
| Variance | 0.002226787277 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=40)
| Value | Count | Frequency (%) |
| 0.13 | 6166 | 7.8% |
| 0.12 | 6049 | 7.6% |
| 0.14 | 5833 | 7.4% |
| 0.16 | 5773 | 7.3% |
| 0.15 | 5700 | 7.2% |
| 0.11 | 5465 | 6.9% |
| 0.17 | 5449 | 6.9% |
| 0.18 | 5188 | 6.5% |
| 0.19 | 4763 | 6.0% |
| 0.1 | 4462 | 5.6% |
| Other values (30) | 24367 |
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 0.01 | 3 | < 0.1% |
| 0.02 | 10 | < 0.1% |
| 0.03 | 34 | < 0.1% |
| 0.04 | 124 | 0.2% |
| 0.05 | 308 | 0.4% |
| 0.06 | 646 | 0.8% |
| 0.07 | 1206 | 1.5% |
| 0.08 | 2086 | |
| 0.09 | 3259 |
| Value | Count | Frequency (%) |
| 0.42 | 2 | < 0.1% |
| 0.39 | 2 | < 0.1% |
| 0.38 | 1 | < 0.1% |
| 0.36 | 3 | < 0.1% |
| 0.35 | 3 | < 0.1% |
| 0.34 | 5 | < 0.1% |
| 0.33 | 7 | < 0.1% |
| 0.32 | 10 | < 0.1% |
| 0.31 | 32 | |
| 0.3 | 60 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
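The definition of ρ can be sketched in plain Python: rank both variables, then divide the covariance of the ranks by the product of their standard deviations. The function names `average_ranks`, `pearson_r` and `spearman_rho` are illustrative and are not taken from pandas-profiling; ties receive the average of the ranks they span, which is a common convention and an assumption here.

```python
from math import exp, sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def spearman_rho(x, y):
    """Pearson correlation applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but strongly nonlinear relationship.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [exp(v) for v in x]
print("Spearman:", spearman_rho(x, y))
print("Pearson: ", pearson_r(x, y))
```

Because `y` increases whenever `x` does, the ranks agree exactly and ρ comes out at 1 up to floating-point rounding, while Pearson's r is noticeably lower for the same data.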
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
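A minimal sketch of this calculation, together with a check of the invariance property, is shown here. The input values are the first five X_39 and X_40 entries from the "First rows" table; the function name `pearson_r` and the particular shift and scale factors are illustrative choices.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # The 1/n factors of the covariance and both deviations cancel out.
    return cov / (spread_x * spread_y)

# First five rows of X_39 and X_40 from the report.
x_39 = [-16.36, -16.11, -16.17, -16.03, -16.23]
x_40 = [-17.03, -16.74, -16.76, -16.67, -16.85]

r = pearson_r(x_39, x_40)

# Shift and rescale each variable separately (positive scale factors).
x_39_moved = [2.0 * v + 3.0 for v in x_39]
x_40_moved = [0.5 * v - 1.0 for v in x_40]
r_moved = pearson_r(x_39_moved, x_40_moved)

print(r, r_moved)
```

The two printed values agree up to floating-point rounding. A negative scale factor on one variable would flip the sign of r, and a constant variable has zero spread, so this sketch would raise a division error for it.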
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
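The pair-counting formula can be sketched directly. This implements the version exactly as worded above (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations commonly apply a tie correction instead, so results can differ when the data contain ties. The function name `kendall_tau` is illustrative.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs; needs >= 2 observations."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The sign of the product tells whether both variables move the same way.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y: counted in neither group.
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]   # only the middle two observations are out of order
print(kendall_tau(x, y))
```

Of the six possible pairs, five are concordant and one is discordant, giving τ = (5 - 1) / 6. The double loop is quadratic in the number of observations, which is fine for illustration but slow for a table with tens of thousands of rows.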
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
First rows
| | df_index | X_38 | X_39 | X_40 | X_41 | X_42 | X_43 | X_44 | X_45 |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | -16.41 | -16.36 | -17.03 | 21.20 | 20.99 | 21.28 | 21.09 | 0.29 |
| 1 | 1 | -16.06 | -16.11 | -16.74 | 21.16 | 21.03 | 21.16 | 21.13 | 0.13 |
| 2 | 2 | -16.16 | -16.17 | -16.76 | 21.13 | 21.03 | 21.17 | 21.12 | 0.14 |
| 3 | 3 | -16.05 | -16.03 | -16.67 | 21.18 | 20.98 | 21.20 | 21.09 | 0.22 |
| 4 | 4 | -16.25 | -16.23 | -16.85 | 21.16 | 20.96 | 21.18 | 21.10 | 0.22 |
| 5 | 5 | -16.45 | -16.50 | -17.14 | 21.17 | 21.07 | 21.19 | 21.16 | 0.12 |
| 6 | 6 | -15.60 | -15.58 | -16.23 | 21.18 | 20.99 | 21.20 | 21.10 | 0.21 |
| 7 | 7 | -16.08 | -15.99 | -16.68 | 21.19 | 21.00 | 21.23 | 21.12 | 0.23 |
| 8 | 8 | -15.85 | -15.91 | -16.54 | 21.17 | 21.07 | 21.17 | 21.16 | 0.10 |
| 9 | 9 | -16.03 | -16.00 | -16.63 | 21.13 | 21.00 | 21.17 | 21.10 | 0.17 |
Last rows
| | df_index | X_38 | X_39 | X_40 | X_41 | X_42 | X_43 | X_44 | X_45 |
|---|---|---|---|---|---|---|---|---|---|
| 79205 | 39598 | -15.54 | -15.48 | -16.19 | 21.24 | 21.10 | 21.27 | 21.18 | 0.17 |
| 79206 | 39599 | -15.74 | -15.75 | -16.42 | 21.21 | 21.11 | 21.27 | 21.18 | 0.16 |
| 79207 | 39600 | -16.03 | -16.01 | -16.64 | 21.18 | 21.03 | 21.24 | 21.10 | 0.21 |
| 79208 | 39601 | -16.03 | -16.08 | -16.72 | 21.17 | 21.12 | 21.23 | 21.21 | 0.11 |
| 79209 | 39602 | -16.04 | -16.02 | -16.63 | 21.17 | 21.04 | 21.17 | 21.13 | 0.13 |
| 79210 | 39603 | -16.17 | -16.26 | -16.88 | 21.16 | 21.13 | 21.24 | 21.19 | 0.11 |
| 79211 | 39604 | -16.11 | -16.10 | -16.73 | 21.16 | 21.03 | 21.22 | 21.12 | 0.19 |
| 79212 | 39605 | -16.23 | -16.32 | -16.93 | 21.16 | 21.11 | 21.23 | 21.17 | 0.12 |
| 79213 | 39606 | -15.99 | -16.05 | -16.67 | 21.18 | 21.10 | 21.21 | 21.19 | 0.11 |
| 79214 | 39607 | -15.75 | -15.81 | -16.44 | 21.17 | 21.10 | 21.23 | 21.18 | 0.13 |